"Reinforcement Learning": meaning, Chinese translation | phonetic transcription | pronunciation | usage | phrases and example sentences | synonyms and antonyms

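1.

This paper examines the low learning efficiency that improper generalization and a random exploration policy cause in reinforcement learning under deterministic MDPs, and proposes a hierarchical reinforcement learning algorithm based on a system model.

To address the low learning efficiency caused by state-value generalization and random exploration policies when reinforcement learning algorithms are applied to the control of deterministic MDP systems, this paper proposes a model-based hierarchical reinforcement learning algorithm.

Source: selected from the internet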

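2.

Considering discrete and continuous state spaces separately, an entropy-based reinforcement learning algorithm and a reinforcement learning method based on a self-generating neural network function approximator are studied.

For the compression of discretized state spaces and of continuous state spaces, respectively, an information-entropy-based reinforcement learning algorithm and a reinforcement learning algorithm based on a self-generating neural network function approximator are proposed.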

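3.

This fault monitoring policy is based on a multi-agent Markov decision process model and makes use of a reinforcement learning mechanism.

Using multi-agent Markov decision processes, a new multi-manager network fault monitoring mechanism is established, and a reinforcement-learning-based polling policy under this mechanism is given.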

4.

Reinforcement Learning Technology in Multi-Agent Systems

Research status and development trends of reinforcement learning in multi-agent systems

5.

Comparative Analysis of Single-Agent Reinforcement Learning and Multi-Agent Reinforcement Learning

A comparative study of single-agent and multi-agent reinforcement learning

6.

A Study of Multi-Agent Reinforcement Learning Methods for Cooperative Teams

Research on reinforcement learning methods for cooperative multi-agent teams

7.

For the problem of multi-wheel coordination in the motion control of a lunar rover, an adaptive control method based on hybrid policy gradient reinforcement learning is proposed.

To address multi-wheel coordination in lunar rover motion control, an adaptive control method based on hybrid policy gradient reinforcement learning is put forward.

8.

An Intention-Tracking-Based Reinforcement Learning Agent Model

An agent model based on intention tracking and reinforcement learning

9.

Extending reinforcement learning to MDPs with large state and action spaces and high complexity inevitably runs into the curse of dimensionality, which results in slow convergence and long training times.

When traditional reinforcement learning algorithms are applied to Markov decision process problems with large state and action spaces and complex tasks, they suffer from slow convergence and long training times.

10.

A Multi-Agent Cooperative Reinforcement Learning Algorithm Based on a Team Markov Game

A multi-agent cooperative reinforcement learning algorithm based on team Markov games

11.

The problems of discounted reinforcement learning are analyzed. Several experiments compare the influence of different discount factors on the SARSA(λ) algorithm for MDPs. The effect of the average-reward constant on the undiscounted SARSA(λ) algorithm is also discussed.

The problems of discounted reinforcement learning are analyzed, comparative experiments on discounting are carried out for the SARSA(λ) algorithm on MDPs, and the influence of the average-reward constant on the undiscounted SARSA(λ) algorithm is discussed.

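12.

Hierarchical reinforcement learning (HRL) was proposed to combat the curse of dimensionality and has made great progress.

Hierarchical reinforcement learning (HRL) was introduced to solve the curse-of-dimensionality problem in reinforcement learning and has achieved notable progress.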

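13.

In research on the theoretical framework of policy gradient reinforcement learning, it is proved that the gradient estimation formulas of all existing policy gradient algorithms can be unified.

The innovations and research results of this thesis mainly include: 1. In the study of the theoretical framework of policy gradient reinforcement learning, it is proved that the gradient estimation formulas of existing policy gradient reinforcement learning algorithms all conform to a unified form.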

14.

Q-learning is of great importance in reinforcement learning.

Q-learning is an important reinforcement learning algorithm.

15.

On-Policy Model-Free Reinforcement Learning Algorithms for Average-Payoff MDPs

On-policy model-free reinforcement learning algorithms for average-reward MDPs

16.

Fuzzy policy gradient reinforcement learning, which incorporates a priori knowledge by using fuzzy inference systems, is studied in this dissertation.

Policy gradient reinforcement learning algorithms that introduce prior knowledge through fuzzy inference systems are studied.

17.

S(λ): A reinforcement learning algorithm based on average-payoff MDPs

S(λ): a reinforcement learning algorithm based on average-reward MDPs

18.

A multi-agent coordination model and a corresponding algorithm based on distributed reinforcement learning are proposed.

A multi-agent coordination model based on distributed reinforcement learning is proposed, together with the corresponding algorithm.

19.

Agent reinforcement learning is an important branch of machine learning.

Reinforcement learning for agents is an important branch of machine learning.

20.

The concepts of Markov decision processes and reinforcement learning are introduced first.

The thesis first introduces the basic concepts of Markov decision processes and the framework of reinforcement learning.

21.

Survey of Multi-agent Reinforcement Learning in Markov Games

A survey of multi-agent reinforcement learning methods in the stochastic game framework

22.

A hybrid holistic and partial defense model based on reinforcement learning is presented. It aims to strengthen the defense of the whole team through role exchange among agents, and it is superior to the traditional model.

Based on reinforcement learning principles, a hybrid global and local defense model is proposed; it improves defensive capability through role switching among agents and has advantages over traditional approaches.

23.

The Optimal Reward Baseline for Policy-Gradient Reinforcement Learning

The optimal reward baseline in policy gradient reinforcement learning

24.

Research on a Multi-Agent Reinforcement Learning Based Dynamic Coordination Mechanism for Wartime Spares Support

A dynamic coordination mechanism for wartime spare-parts supply support based on multi-agent reinforcement learning

25.

The coordination behavior layer uses reinforcement learning to strengthen the robots' intelligence.

The coordination behavior layer applies reinforcement learning to enhance the intelligence of the robot group.

26.

In this paper, we analyze several reinforcement learning methods, including value-based reinforcement learning (VBRL), policy-gradient reinforcement learning, and actor-critic reinforcement learning.

This paper analyzes several reinforcement learning methods, including value-function-based (Value-Based) approximation methods, policy gradient methods, and Actor-Critic methods.

27.

Application and development of reinforcement learning theory in power systems

Applications and prospects of reinforcement learning theory in power systems

28.

Two fuzzy policy gradient reinforcement learning algorithms are proposed for Markov decision processes with discrete and continuous actions, respectively.

For Markov decision problems with discrete and with continuous action spaces, respectively, this paper proposes two fuzzy policy gradient (FPG) reinforcement learning methods.

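29.

A hybrid policy gradient reinforcement learning control method is proposed to solve this complex optimization control problem, in which teacher signals are difficult to obtain and fuzzy rules are difficult to design.

For this complex optimal control problem, in which teacher signals are hard to obtain and fuzzy rules are hard to formulate, this paper proposes a multi-wheel coordinated control method based on the hybrid policy gradient reinforcement learning method PG-SVM.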

30.

Research on and Application of Reinforcement Learning and Communication Technology in Agents

Research on and application of reinforcement learning and communication technology for agents